Integrating a Large-Scale, Reusable Lexicon with a Natural Language Generator
Authors
Abstract
This paper presents the integration of a large-scale, reusable lexicon for generation with the FUF/SURGE unification-based syntactic realizer. The lexicon was combined from multiple existing resources in a semi-automatic process. The integration is a multi-step unification process. This integration allows the reuse of lexical, syntactic, and semantic knowledge encoded in the lexicon in the development of the lexical chooser module in a generation system. The lexicon also brings other benefits to a generation system: for example, the ability to generate many lexical and syntactic paraphrases and the ability to avoid non-grammatical output.

1 Introduction

Natural language generation requires lexical, syntactic, and semantic knowledge in order to produce meaningful and fluent output. Such knowledge is often hand-coded anew when a different application is developed. We present in this paper the integration of a large-scale, reusable lexicon with a natural language generator, FUF/SURGE (Elhadad, 1992; Robin, 1994); we show that by integrating the lexicon with FUF/SURGE as a tactical component, we can reuse the knowledge encoded in the lexicon and automate to some extent the development of the lexical realization component in a generation application. The integration of the lexicon with FUF/SURGE also brings other benefits to generation, including the ability to accept a semantic input at the level of WordNet synsets, the production of lexical and syntactic paraphrases, the prevention of non-grammatical output, reuse across applications, and wide coverage.

We present the process of integrating the lexicon with FUF/SURGE, including how to represent the lexicon in FUF format, how to unify input with the lexicon incrementally to generate more sophisticated and informative representations, and how to design an appropriate semantic input format so that the integration of the lexicon and FUF/SURGE can be done easily.

This paper is organized as follows. In Section 2, we explain why a reusable lexical chooser for generation needs to be developed. In Section 3, we present the large-scale, reusable lexicon which we combined from multiple resources, and illustrate its benefits to generation by examples. In Section 4, we describe the process of integrating the lexicon with FUF/SURGE, which includes four unification steps, with each step adding additional lexical or syntactic information. Other applications and comparison with related work are presented in Section 5. Finally, we conclude by discussing future work.

2 Building a reusable lexical chooser for generation

While reusable components have been widely used in generation applications, the concept of a "reusable lexical chooser" for generation remains novel. There are two main reasons why such a lexical chooser has not been developed in the past:

1. In the overall architecture of a generator, the lexical chooser is an internal component that depends on the semantic representation and formalism and on the syntactic realizer used by the application.
2. The lexical chooser links conceptual elements to lexical items. Conceptual elements are by definition domain and application dependent (they are the primitive concepts used in an application knowledge base). These primitives are not easily ported from application to application.
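The incremental unification mentioned in the introduction, where each step adds lexical or syntactic information to the input, can be illustrated with a minimal sketch. This is not FUF/SURGE code and not the paper's actual procedure: it is a simplified Python model of feature-structure unification over nested dictionaries, and every feature name and value in it (`concept`, `lex`, `subcat`, `transfer-possession`, and so on) is a hypothetical placeholder. Only two steps are shown, for brevity.

```python
# Simplified feature-structure unification (an illustration only, not FUF).
FAIL = object()  # sentinel meaning "unification failed"


def unify(a, b):
    """Unify two feature structures given as nested dicts or atomic values."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # copy so the inputs are never mutated
        for key, b_val in b.items():
            if key in result:
                merged = unify(result[key], b_val)
                if merged is FAIL:
                    return FAIL  # conflicting values for the same feature
                result[key] = merged
            else:
                result[key] = b_val  # new information is simply added
        return result
    # Atomic values (or a dict against an atom) unify only if they are equal.
    return a if a == b else FAIL


def unify_steps(semantic_input, steps):
    """Apply several unification steps in order, enriching the input each time."""
    current = semantic_input
    for step in steps:
        current = unify(current, step)
        if current is FAIL:
            return FAIL
    return current


# Hypothetical input and lexicon fragments (all names are assumptions).
semantic_input = {"concept": "transfer-possession", "tense": "past"}
lexical_step = {"concept": "transfer-possession", "lex": "give"}
syntactic_step = {"lex": "give", "subcat": {"pattern": "NP-V-NP-NP"}}

enriched = unify_steps(semantic_input, [lexical_step, syntactic_step])
clash = unify(enriched, {"lex": "sell"})  # conflicts with lex = "give"
```

In this toy model, a failed unification is how an incompatible combination of features is rejected, while a successful one yields a strictly more informative structure than its input.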
Related resources
Combining Multiple, Large-Scale Resources in a Reusable Lexicon for Natural Language Generation
A lexicon is an essential component in a generation system but few efforts have been made to build a rich, large-scale lexicon and make it reusable for different generation applications. In this paper, we describe our work to build such a lexicon by combining multiple, heterogeneous linguistic resources which have been developed for other purposes. Novel transformation and integration of resour...
Two-Level, Many-Path Generation
Large-scale natural language generation requires the integration of vast amounts of knowledge: lexical, grammatical, and conceptual. A robust generator must be able to operate well even when pieces of knowledge are missing. It must also be robust against incomplete or inaccurate inputs. To attack these problems, we have built a hybrid generator, in which gaps in symbolic knowledge are filled by ...
OntoNL: An Ontology-based Natural Language Interaction Generator for Multimedia Repositories
We propose a generalized implementation framework that can be used to easily provide natural language interactions for managing multimedia content and user profiles which are described in the information repository with metadata structured according to international standards like MPEG-7 and TV-Anytime (upper ontologies). The description of the content in the repository may also utilize metadat...
Yet Another Head Driven Generator of Natural Language
The paper discusses some basic issues in natural language generation (NLG) and describes a head-driven NLG system for HPSG-like language descriptions. We address mainly the aspects of surface natural language generation starting from a meaning representation for the message meant to be verbalized. The representation of the linguistic knowledge (grammar and lexicon) as well as most important imp...
Integrating connectionist, statistical and symbolic approaches for continuous spoken Korean processing
This paper presents a multi-strategic and hybrid approach for large-scale integrated speech and natural language processing, employing connectionist, statistical and symbolic techniques. The developed spoken Korean processing engine (SKOPE) integrates connectionist TDNN-based phoneme recognition technique with statistical Viterbi-based lexical decoding and symbolic morphological/phonological an...
Journal title:
Volume, issue:
Pages: -
Publication date: 2000